INTRODUCTION

In using computers to represent and manipulate Chinese characters, 
much effort to date has quite properly been devoted to the primary problems 
of compactness of coding, storage and retrieval, and character recognition 
[References 1 through 5]. Aesthetic problems in matters of typography and 
calligraphy have necessarily played a lesser role. Yet, looking ahead to
the time when the aesthetic quality of computer generated output may be
allowed to assume a greater significance, it is clear that much work
remains to be done. Our purpose in this paper is to call attention to the
possibilities inherent in the use of spline functions to represent
arbitrary graphical figures in general, and Chinese characters in
particular. Although the techniques have quite wide applicability, we have
chosen to illustrate their use in the treatment of hand-written characters 
input to a small computer system in real time. 

Cubic spline functions have properties that make them particularly 
suitable for approximating hand-drawn curves for which relatively few 
sample points are available. Most importantly, they are the mathematical
analog of the splines used by draftsmen. A mechanical spline is a thin strip
of plastic or metal that can be constrained to a particular shape with lead
weights (called "dogs") that hold the strip in place.




As a consequence of behaving like mechanical splines, cubic spline 
functions are smooth, and furthermore, a change in the position of one of
the dogs does not radically change the shape of the whole curve. These
properties are not shared, for example, by interpolatory polynomials.



Outline of the method 



Characters are written on a digitizing tablet which is interfaced to a
small computer system. [Footnote: The system consists of a Data General
Nova 800 minicomputer to which we have interfaced an Imlac PDS-1 display
processor and a Scriptographics tablet.] As each stroke, or, in the case of
cursive writing, each character, is written, the pen position is sampled
periodically and the data temporarily stored.
FIGURE 1
ORIGINAL TABLET COORDINATE DATA 



This results in a relatively dense collection of coordinate pairs for each
stroke. Two problems immediately arise: 1) for a sampling rate high
enough to avoid loss of small detail (we used 5 milliseconds between
samples), the number of points may run to several hundred per stroke, far
too many for convenient storage; 2) the sampling process may result in an
unintended roughness, especially for slowly written strokes. Before
proceeding with the analysis, therefore, the data are thinned to a
relatively small number of representative points. This should be done in a
manner which relates the number of points to the curvature. We have chosen
a heuristic procedure for obtaining representative points which selects
more of them when the curvature is high, fewer when the curvature is low,
and yet enough of them that small detail is not lost when the stroke is
reconstructed. Figure 2 shows a set of the representative points, as
crosses, superimposed over the coordinate data in Figure 1.




FIGURE 2 
REPRESENTATIVE POINTS 



All of the other data points are then discarded. To reconstruct the
stroke, spline functions are used to interpolate a set of points between
the representative points (we used 5 such points). These are then joined
by straight lines and displayed, thus forming an approximation to the
original stroke by linear segments.



FIGURE 3 
RECONSTRUCTED STROKE 



Thus, a character or group of characters is formed by approximating
and regenerating each stroke on the graphics display as it is written.
Because the computations are rapid, this process is much like writing
characters using a pen and paper.

Figures 4 and 5 are illustrative of characters generated using this 
method. 



The remainder of this paper is devoted to the details outlined here,
with the hope that others may be able to use these techniques for their own
applications.
FIGURE 5
SECTION 1
SPLINE FUNCTIONS 

Since the modern mathematical theory of spline approximation was
introduced by I. J. Schoenberg in 1946 [Reference 6], many books and
numerous articles have appeared in the literature dealing with the
properties and characteristics of splines as approximants [References 7, 8
and 9]. In this paper we will restrict our attention to cubic natural
splines. For a more general treatment of spline functions, please refer to
the aforementioned references.

Definition: Cubic Natural Spline.

Let

$$T = \{t_1, t_2, t_3, \ldots, t_n\}$$

be a set of real numbers (referred to as knots) such that

$$t_i < t_{i+1}.$$

A function S(t) is called a cubic natural spline if it satisfies:

1) S is a polynomial of degree 3 on each interval $(t_i, t_{i+1})$.

2) S and its first and second derivatives are continuous
everywhere.

3) In the intervals $(-\infty, t_1)$, $(t_n, +\infty)$, S is linear.
(Commonly called the natural end conditions.)



Pictorially:

[Figure: sketch of S(t) with its polynomial pieces $P_i(t)$ between successive knots]

Here $P_i(t)$ denotes the i-th polynomial. Note that if all the $P_i$ are
equal, the function S(t) reduces to a single cubic polynomial in the
interval $(t_1, t_n)$. It is for this reason that spline functions are often
called a generalization of polynomial functions.



It follows easily from this definition that S(t) can be written in the
form:

(*)

$$S(t) = b_0 + b_1 t + \sum_{i=1}^{n} a_i (t - t_i)_+^3, \quad \text{where } (u)_+ = \begin{cases} u & \text{if } u > 0 \\ 0 & \text{if } u \le 0. \end{cases}$$



It also follows simply from the definition that there is a unique
solution to the interpolation problem. That is, if we constrain S to take
on the values $y_i$ at the knots $t_i$, i = 1, 2, 3, ..., n, there is a unique
natural cubic spline function S satisfying $S(t_i) = y_i$. Furthermore, S is
the function in AC (the class of all functions with absolutely continuous
first derivatives) that minimizes

$$\int_{t_1}^{t_n} [f''(t)]^2 \, dt$$

and interpolates the $y_i$. This integral is commonly called the strain
energy, and the fact that cubic splines possess this property accounts for
their smoothness. It should be noted that by simply applying the
interpolation constraints and the natural end conditions to (*), the
resultant (n+2) x (n+2) linear system can be solved for $b_0$, $b_1$, and
$a_i$ (i = 1, 2, 3, ..., n), thereby determining the spline. This procedure
is to be avoided, however, because the linear system is numerically
unstable. There are much better ways to compute splines, one of which is
detailed in Section 3 of this paper.
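
To make the size of that system concrete, its equations can be written out explicitly. The following is a sketch that assumes the truncated-power form $S(t) = b_0 + b_1 t + \sum_i a_i (t - t_i)_+^3$; the coefficient names $b_0$ and $b_1$ are a labeling assumption. The n interpolation constraints come first. The last two equations express linearity on $(t_n, +\infty)$: there $S''(t) = 6\sum_i a_i (t - t_i)$, which vanishes for all t only if both sums below are zero. Linearity on $(-\infty, t_1)$ is automatic in this form.

```latex
\begin{align*}
b_0 + b_1 t_j + \sum_{i=1}^{j-1} a_i (t_j - t_i)^3 &= y_j, \qquad j = 1, 2, \ldots, n \\
\sum_{i=1}^{n} a_i &= 0 \\
\sum_{i=1}^{n} a_i t_i &= 0
\end{align*}
```

This gives n + 2 equations in the n + 2 unknowns $b_0$, $b_1$, $a_1, \ldots, a_n$. The cubed differences $(t_j - t_i)^3$ grow rapidly with the spread of the knots, which is one way to see why the system is poorly conditioned.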
SECTION 2
PARAMETRIC REPRESENTATION OF THE STROKES 

To allow approximation and regeneration of non-functional strokes such
as the curve in Figure 7, it is convenient to represent them
parametrically.




FIGURE 7 



For example, X(t) and Y(t) shown in Figures 8 and 9 will generate the loop
in Figure 7.



FIGURE 8 
We represented each stroke by obtaining two functions X(t) and Y(t)
which are the natural cubic spline functions that interpolate the
representative point data $x_i$ and $y_i$ at the parametric values i. Of
course, the function F = (X(t), Y(t)) interpolates the $(x_i, y_i)$, where
i = 1, 2, 3, ..., n, but as a consequence of this particular representation
F is not a cubic spline itself; rather, F is a function with first
derivative and curvature continuity. This scheme works well visually and
the loss of the second derivative continuity does not noticeably affect the
aesthetic appearance of the curve. We chose the rather simple scheme of
parameterizing by knot number for several reasons. First, at least
visually, we could not see the difference between this choice and other
computationally more complicated schemes such as parameterizing by length
between knots. Secondly, this scheme simplifies the computation
considerably when solving the linear systems for the functions X(t) and
Y(t). Finally, when one plots a constant number of short line segments
between knots in the parameter space t, the corresponding line segments in
the (x, y) space will be short when the curvature is high and longer when
the curvature is low. Aesthetically this is a very desirable property, and
curves whose plots have this property are called "fair" in the sense of
Forrest [Reference 12]. "Fairness" is a consequence of the knot placement
heuristic which selects knots (representative points) more densely when the
curvature is high.
SECTION 3
COMPUTATION OF THE SPLINES X(t) AND Y(t) 

In this section a method is presented to solve the following problem: 



Given n representative points $y_i$, where i = 1, 2, 3, ..., n, and the n
values T = {1, 2, 3, ..., n}, compute the cubic natural spline
function that interpolates the $y_i$ with knots T.



The general formulation of this method for computing interpolatory spline
functions can be found in Reference 10 (see References at the end of this
memo). Because the knots are at integral values, the method becomes
computationally particularly simple and rapid. For simplicity, the method
is presented in terms of Y(t). Computation of X(t) is the same.

Because a cubic spline function is a cubic polynomial in each interval
(i, i+1), letting $P_i(t)$ denote the polynomial in the interval (i, i+1),
$P_i(t)$ can be represented by its Taylor series:

$$P_i(t) = y_i + P_i'(i)(t-i) + P_i''(i)(t-i)^2/2! + P_i'''(i)(t-i)^3/3!$$

Hence, if we can determine $P_i'(i)$, $P_i''(i)$ and $P_i'''(i)$ for
i = 1, 2, 3, ..., n-1, the spline function will be completely specified.
Using the continuity constraints associated with cubic splines, it is easy
to show that the following relationships hold:

1) $P_i'(i) = \Delta y_i - (2P_i''(i) + P_{i+1}''(i+1))/6$

2) $P_i'''(i) = P_{i+1}''(i+1) - P_i''(i)$

3) $P_{i-1}''(i-1) + 4P_i''(i) + P_{i+1}''(i+1) = 6\Delta^2 y_{i-1}$

where $\Delta$ is the difference operator $\Delta y_i = y_{i+1} - y_i$.



The basic idea behind the method can now be stated:

Using recursion relationship (3) and the fact that $P_1''(1) = P_n''(n) = 0$
(since the spline is natural), obtain a linear system of equations with
the $P_i''(i)$ as unknowns. Then solve this system. Since relationships
(1) and (2) only involve the $y_i$ and the second derivatives of the
polynomials in adjacent intervals, the first and third derivatives for
each polynomial can be found. Thus, the spline is completely
specified over the whole interval (1, n) by obtaining the Taylor series
for each of the n-1 cubic polynomials that make up the spline.
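
The last step of this idea, going from the second derivatives to the Taylor series of each piece, can be sketched in Python as follows. This is a minimal illustration, not the original Nova 800 code: the function names `taylor_pieces` and `evaluate` are hypothetical, the list `M` is assumed to already hold the second derivatives at the knots 1..n (with zeros at both ends), and list index k stands for knot k+1.

```python
def taylor_pieces(y, M):
    """For each interval (i, i+1), return the value and the first, second
    and third derivatives of the cubic piece at its left knot, given the
    ordinates y and the second derivatives M at the integer knots 1..n."""
    pieces = []
    for k in range(len(y) - 1):
        # Relationship (1): first derivative from a difference of y and
        # the second derivatives of the two adjacent pieces.
        first = (y[k + 1] - y[k]) - (2.0 * M[k] + M[k + 1]) / 6.0
        # Relationship (2): third derivative is a difference of second derivatives.
        third = M[k + 1] - M[k]
        pieces.append((y[k], first, M[k], third))
    return pieces


def evaluate(pieces, t):
    """Evaluate the spline at a parameter t in [1, n] from its Taylor series."""
    k = min(max(int(t) - 1, 0), len(pieces) - 1)  # interval containing t
    u = t - (k + 1)                               # offset from the left knot
    y0, d1, d2, d3 = pieces[k]
    # Horner form of y0 + d1*u + d2*u^2/2! + d3*u^3/3!
    return y0 + u * (d1 + u * (d2 / 2.0 + u * d3 / 6.0))
```

Applying the same two functions to the x data and the y data gives X(t) and Y(t). Points between knots i and i+1 are then obtained by evaluating both at intermediate parameter values, for example t = i + j/6 for j = 1, ..., 5 if five equally spaced points are wanted (the spacing is an assumption; the text only gives the count).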

The linear system arising from the recursion relationship (3) and the 
natural end conditions, is: 
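(*)

$$
\begin{bmatrix}
4 & 1 & & & & \\
1 & 4 & 1 & & & \\
 & 1 & 4 & 1 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & 1 & 4 & 1 \\
 & & & & 1 & 4
\end{bmatrix}
\begin{bmatrix}
P_2''(2) \\ P_3''(3) \\ P_4''(4) \\ \vdots \\ P_{n-2}''(n-2) \\ P_{n-1}''(n-1)
\end{bmatrix}
=
\begin{bmatrix}
6\Delta^2 y_1 \\ 6\Delta^2 y_2 \\ 6\Delta^2 y_3 \\ \vdots \\ 6\Delta^2 y_{n-3} \\ 6\Delta^2 y_{n-2}
\end{bmatrix}
$$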



Because this system of equations is diagonally dominant, it is
numerically stable and can be solved by any of the standard techniques. 
However, due to the particularly simple form of the coefficient matrix and 
our desire to make the computation rapid, we choose to use Gaussian 
elimination and back substitution to obtain the second derivatives. 

The details of the Gaussian elimination are as follows:

(Consider the left-hand side of (*) first.)

Multiply the first row of (*) by 1/4 and subtract the result from row
2. This eliminates the 1 in column 1 of row 2. The new diagonal
element in row 2 is $v_2 = 4 - 1/4$. Next, row 2 is multiplied by
$1/v_2$ and subtracted from row 3. This eliminates the 1 in column
2 of row 3 and changes the diagonal element in row 3 to be

$$v_3 = 4 - \frac{1}{v_2} = 4 - \frac{1}{4 - 1/4}.$$
Continuing this process, it is clear that the following recursion
relationship holds for the diagonal elements:

$$v_i = 4 - 1/v_{i-1}, \quad \text{with } v_1 = 4,$$

and all of the ones in the lower triangular part of the
coefficient matrix will be eliminated. That is, the coefficient
matrix will have the form:

$$
\begin{bmatrix}
v_1 & 1 & & & \\
 & v_2 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & v_{n-3} & 1 \\
 & & & & v_{n-2}
\end{bmatrix}
$$

If one knows the maximum size of the linear system to be solved
(as we did in this work) all of the $v_i$ can be calculated in
advance and stored in a table.
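
A table of this kind can be built in a few lines. The Python sketch below is only an illustration; the function name `pivot_table` and the maximum size of 64 are assumptions, not values from this work.

```python
def pivot_table(max_rows):
    """Return [v_1, ..., v_max_rows] with v_1 = 4 and v_i = 4 - 1/v_(i-1)."""
    v = [4.0]
    for _ in range(1, max_rows):
        v.append(4.0 - 1.0 / v[-1])
    return v


# Computed once, before any stroke is processed.
# 64 is an arbitrary illustrative upper bound on the number of rows.
V_TABLE = pivot_table(64)
```

The entries settle quickly toward 2 + sqrt(3), the larger root of v = 4 - 1/v, so in practice only the first few values differ noticeably from one another.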

The row operations described above also affect the right hand
side of (*).

Letting $b_i$ denote the i-th element of the right hand side of (*)
after the row operations have been performed, we have:

$$b_{i+1} = 6\Delta^2 y_{i+1} - b_i / v_i \quad \text{with} \quad b_1 = 6\Delta^2 y_1.$$



Having computed the right hand side of (*), the back substitution
proceeds:

$$P_{n-1}''(n-1) = b_{n-2} / v_{n-2}$$

and

$$P_i''(i) = \left(b_{i-1} - P_{i+1}''(i+1)\right) / v_{i-1} \quad \text{for } i = n-2, n-3, \ldots, 2,$$

and we are done with the second derivative calculations.
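
The elimination and back substitution described above can be put together as in the following Python sketch. It is an illustration under stated assumptions rather than the original program: the function name `second_derivatives` is hypothetical, list index k stands for knot k+1, and the pivots are recomputed on each call instead of being read from a precomputed table.

```python
def second_derivatives(y):
    """Second derivatives of the natural cubic spline through y[0..n-1]
    at the integer knots 1..n. Entry k of the result belongs to knot k+1."""
    n = len(y)
    M = [0.0] * n              # natural end conditions: zero at both ends
    if n < 3:
        return M               # one interval or less: a straight line
    # Right hand side: 6 times the second difference at each interior knot.
    rhs = [6.0 * (y[k + 2] - 2.0 * y[k + 1] + y[k]) for k in range(n - 2)]
    # Forward elimination: pivots v and modified right hand side b.
    v = [4.0]
    b = [rhs[0]]
    for k in range(1, n - 2):
        v.append(4.0 - 1.0 / v[k - 1])
        b.append(rhs[k] - b[k - 1] / v[k - 1])
    # Back substitution, starting from the last interior knot.
    M[n - 2] = b[n - 3] / v[n - 3]
    for k in range(n - 4, -1, -1):
        M[k + 1] = (b[k] - M[k + 2]) / v[k]
    return M
```

For three points with ordinates 0, 1, 0 the single equation is 4 P'' = 6(0 - 2 + 0), so the middle second derivative is -3, which is what the function returns.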

Estimating the computation time required for the calculation of the n
second derivatives (assuming addition time = subtraction time and
multiplication time = division time, and neglecting single operations) we
have:

to compute the $\Delta^2 y_i$ requires 3n additions;

to compute the $b_i$ requires n additions and 2n multiplications;

the back substitution requires n additions and n multiplications.

Summing these, we conclude that the second derivative computation requires
5n additions and 3n multiplications.
From relationships (1) and (2) in this section, we conclude that
calculation of the third derivatives of the spline requires n additions, 
and the calculation of the first derivatives requires 2n additions and 2n 
multiplications. Hence the total spline computation uses 8n additions and 
5n multiplications. For example, using our Nova 800 computer, the 
computation of X(t) and Y(t) for a typical stroke (10 representative 
points) requires approximately 50 milliseconds. 
CONCLUSION

The calculating power of computers of modest size is sufficient to
permit the representation of hand-written characters by spline function
approximations in real time. Further application might involve tracing or
other methods of coordination. The method requires the retention of only
relatively few coordinate pairs representative of the original data, yet
makes possible reconstructions of arbitrary scale in which the visual
quality of the reconstructed curve is limited only by the number of
interpolated points. The required calculations are tractable and fast.

One must look beyond the poorly written characters we have represented
here in illustration of the basic techniques. Further refinements may lead
to a set of practical tools which, under the guidance of accomplished
calligraphers and typographers, will help to enhance the quality of
computer generated spline characters or pictures in future systems.




FIGURE 9 - A Spline Generated Man 
R. Flegal

A technique for rapid selection of 
knots for subsequent regeneration of curves and 
areas with spline functions. 

The problem of finding the best (say in the least squares sense) 
approximation to an arbitrary function by an interpolatory cubic spline 
function with a fixed number of knots is non-linear and hence requires 
much computation. In an interactive environment where speed is often
everything and the number of knots is large (as it is with hand-sketched
curves of arbitrary complexity), standard techniques are insufficient.
By relaxing the constraints that the approximation be mathematically "best"
and that the number of knots be fixed, I have devised a technique for selecting
the placement of the knots that is fast (approximately one millisecond per
data point) and whose resultant spline approximations fit (at least
within visual accuracy) a wide variety of hand-sketched data. The
technique also selects "few" knots relative to the amount of original data,
thus achieving significant data compression and speed in computation
when regenerating the curves. I typically achieve compressions of
one hundred to one. The technique will not approximate all curves
equally well (it tends by itself to miss small corners), but as the knot
selection process eventually leads to a closed form mathematical representation
of the curve, it is possible to precisely edit those curves whose
approximation "misses".

The technique

Let $(x_i, y_i)$, i = 0, 1, 2, ..., N be the original unthinned tablet data;
then:

1. Place the first knot at $(x_0, y_0)$ and let $(x_L, y_L) = (x_0, y_0)$.

2. At each subsequent data point $(x_i, y_i)$, i = L, L+1, ..., calculate the
area of the triangle formed by the last knot $(x_L, y_L)$ and the points
$(x_i, y_i)$, $(x_{i-1}, y_{i-1})$ and call this area $E_i$. Also form the
approximation $A = \sum E_i$. At the point $(x_i, y_i)$, A approximates
$\int (f(x) - s(x))\,dx$, where s(x) is the straight line passing through
$(x_L, y_L)$ and $(x_i, y_i)$.

3. Choose a knot at $(x_i, y_i)$ if either:

$(A \ge C)$ or $(E_i > \alpha (E_{i-1} + E_{i-2})$ and $A \ge A_{min})$

The constants C, $\alpha$, and $A_{min}$ are somewhat application and user
dependent; however, I have found that $\alpha = 1$, C = 1/4000 of the total
area of the tablet measured in square resolvable units, and
$A_{min}$ = 1/20000 of the total area of the tablet measured in the same
units are good choices of these constants for most free-hand
sketching systems including the Chinese character creation system.
In summary, in this technique:

A is a measure of the total "curvyness"
between knots.

The condition $E_i > \alpha (E_{i-1} + E_{i-2})$ picks
up abrupt changes in curvature of the
tablet data.

$A_{min}$ is simply included so that noise in the
tablet data will not affect the knot selection
process when near a knot.
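
The selection rule can be sketched in Python as shown below. This is one reading of the three steps under several assumptions that the text does not state: the accumulated area A and the remembered triangle areas are reset to zero whenever a knot is chosen, the final data point is always kept as a knot, and the multiplier `alpha` is applied to the sum of the two previous triangle areas. The names `triangle_area` and `select_knots` are hypothetical.

```python
def triangle_area(p, q, r):
    """Unsigned area of the triangle with vertices p, q, r (each an (x, y) pair)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) / 2.0


def select_knots(points, C, A_min, alpha=1.0):
    """Return the indices of the data points chosen as knots."""
    if not points:
        return []
    knots = [0]          # step 1: the first data point is a knot
    last = 0             # index of the most recent knot
    A = 0.0              # accumulated triangle area since the last knot
    e1 = e2 = 0.0        # the two previous triangle areas
    for i in range(1, len(points)):
        # Step 2: triangle formed by the last knot and the two newest points.
        E = triangle_area(points[last], points[i - 1], points[i])
        A += E
        # Step 3: enough accumulated area, or an abrupt jump above the noise floor.
        if A >= C or (E > alpha * (e1 + e2) and A >= A_min):
            knots.append(i)
            last, A, e1, e2 = i, 0.0, 0.0, 0.0   # assumed reset at each knot
        else:
            e2, e1 = e1, E
    if knots[-1] != len(points) - 1:
        knots.append(len(points) - 1)            # assumed: keep the end point
    return knots
```

With the constants suggested above, `C` would be 1/4000 and `A_min` 1/20000 of the tablet area in square resolvable units. On an L-shaped stroke the knot lands on the first sample after the corner rather than on the corner itself, which agrees with the remark that the technique by itself tends to miss small corners.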



